arXiv: 1502.04890vl [math.ST] 17 Feb 2015 


Algorithm for overlapping estimation of common 
change-sets in spatial data of fixed size 

Leonid Torgovitski*^ 

Mathematical Institute, University of Cologne 
Weyertal 86-90, 50931, Cologne, Germany 


Abstract 

We propose a flexible class of estimates for “common change in the mean” sets 
in spatio-temporal data. We rely on a scan type approach by subdividing the spatial 
observations into suitable overlapping regions to which classical CUSUM (cumulative 
sums) estimates may then be applied separately. The aggregated “local” estimates 
are used to construct consistent “global” estimates of the change set(s) by taking the 
overlapping structure into account. The domain and the change regions may have 
irregular shapes and the suggested procedure is especially suited for estimation of 
multiple change regions. The performance is demonstrated in a simulation study. 
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1 Introduction 


The estimation of common change points (e.g. points which correspond to changes of 
the mean) in the framework of multivariate time series Xi, X%,... ,Xn £ R d , with 
fixed sample size N and an increasing dimension d —> oo, received some attention 
in recent literature (cf., e.g., Bai ( 2010| ), Bleakley and Vert (2010), Hadri et al (2012), 
Kim ( |2014 ) and Torgovitski ( 2015| ) ). ' Within such a setting, Bai (2010) studied a 


single common change point model and considered a classical least squares estimate, 
whereas Bleakley and Vert (2010 2011a) considered a multiple common change point 
model and adapted the total variation denoising approach to it. As pointed out in 


Torgovitski (2015), both methods may be seen as special cases of some general class of 
weighted CUSUM (cumulative sums) estimates. In the latter article, the consistency 
is studied for a whole class of such estimates which will also play a major role in the 
present work. 

In this paper we turn to time series of spatial data X\, X- 2 ,. ■ ., Xj, £ W mxn , 
where the parameters m,n are fixed and d —> oo. (For our approach, the time 
parameter d will turn out to correspond to the previously mentioned dimension 
parameter.) Our aim is to develop an algorithmic framework for the estimation of 
change sets S, where the means E(Xk(i, j)) for (i,j) £ S differ from the means 
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corresponding to ( i,j) £ S. The proposed algorithm is based on theoretical results in 
the aforementioned works of Bai (2010), Bleakley and Vert (2011a) and Torgovitski 


(2015). The method is especially suitable for the estimation of multiple change sets, 
where their number does not necessarily has to be known in advance, and the approach 
may also be easily extended to other, more complex situations, e.g., with an irregular 
domain and irregular change sets. As a special case, the same principle may also be 
applied in a straightforward manner to the common multiple change point estimation 
within the panel data framework in Torgovitski (2015) or Bleakley and Vert (2011a) in 
order to obtain consistent estimates for all changes when the number of panels tends 
to infinity. 

For demonstration purposes, the suggested approaches are implemented with a 
graphical user interface as a Matlab application which can be obtained from the author 
or via www.mi.uni-koeln.de/~ltorgovi 

For some problems and approaches for change-set detection that are remotely 
related to our situation under consideration we refer the reader e.g. to Arnold and 


Wied (2012), 


and Spokoiny 


Arnold et al (2014) and the references therein (in particular to Polzehl 


(2000) and to the review article of Qui (2007)). 


Notation 


First, we need to introduce some notation in order to formulate our model. Consider 
a set Sc M 2 of two-dimensional points, i.e. 

S = {( i,j ) | i = h, ■ ■. ,i m G N, j = ji,... ,j n e N, m,n £ N}. 

We call any point (i,j±l),(i±l,j) £ S to be adjacent or a, neighbour to ( i,j)£S. 
Correspondingly, two sets Si, S 2 C S are called adjacent if at least two nodes u £ Si, 
v £ S -2 exist that are adjacent to each other. Furthermore, we will associate S with 
an undirected graph such that each point (i,j) £ S corresponds to a node and such 
that all adjacent nodes are connected by edges. The boundary of S is a subset B C S 
which contains only nodes (i,j) £ S that have less than four distinct neighbours, 
i.e. nodes that are not 4- connected . Correspondingly, an interior node (i. j) £ S 
has to have four neighbours within S. The set S is called connected whenever 
the associated graph is connected, i.e., if there exists a path between any two nodes 
ni,n 2 £ S. 

As usual, we define the distance d(u,v) of two nodes u,v £ S w.r.t. the set S 
as the shortest path between them (within S). Accordingly, we define the distance 
of two sets Si , S 2 C S w.r.t. the set S as 


d(S±, S 2 ) = inf{<f(u, w)| u £ Si, v £ £> 2 }, 


with inf 0 = 00 in which case the sets are obviously disjoint. The Jaccard distance 
between two sets A,BcS is defined by 


dj(A,B) 


\AUB\- \ A D B\ 
\A U B\ 


( 1 . 1 ) 


This is a common measure of distance between two subsets, which will be used to 
quantify the precision of our estimates later on. We are now in a position to formulate 
our actual model. 


Statistical model 

We consider a spatio temporal signal plus noise model given by 

Xk(i,j) =m k (i,j)+£ k (i,j) (1.2) 
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for k = 1,... ,d and (i,j) G D, where D is assumed to be a rectangular domain, 
i.e. 

D = {(i,j )| i = l,...,m, j = 

with m, n > 4 being fixed integers. Here, the rrik(i,j) are the deterministic signals 
and £k{i- j) are the random variables representing the noise. One may interpret the 
sequence {X k } as a random field defined on the lattice D and consider k to be 
the time parameter. Throughout, we assume the family of random variables 


{£fc(b j), j = 1,... ,n} 

to be i.i.d. for each k, identically distributed in k with 0 < E(e i(l, l) 2 ) = cr 2 < oo 
and centered. The data may be dependent in time, for which appropriate conditions 
will be imposed later on. Additionally, we have to assume uniformly bounded finite 
fourth moments. 

For the general setting we may assume a partitioning of the domain D as 


D — Si + ... + Ski (1-3) 

for some K > 1, where each set S k , k = 1,..., K is non-empty and is assumed to 
be connected. Further, we assume piecewise constant means, i.e. 

K 

m k (i,j ) =J2m k (Si)l Sl (i,j), (1-4) 

i —i 


with rrik(Si) £ R, such that for any k it holds that rrik(Si) ^ mk{S u ) for all adjacent 
sets Si,S u with u^l. Our goal is the estimation of the partition S\,...,Sk based 
on the noisy sequence X k . (The Assumption (1.4) can be related e.g. to Polzehl and 


Spokoiny (2000 eq. (2)) but the statistical model there is non temporal.) As already 
mentioned, for m = 1 the setting (1.2) and (1.4) fits into the multiple change point 
scenario in panel data with fixed time parameter where the number of panels d tends 


to infinity (cf., eg., Bleakley and Vert (2011a) and Torgovitski (2015)). 

The rectangular domain D is chosen for simplicity of exposition only and as 
already mentioned all results discussed in this paper can be easily extended to more 
complex situations in a straightforward manner. Recall that we consider asymptotics 
for d —► oo. 


Definition 1.1. Assume the partitioning (1.3). We will call the sets Si for l = 
1 ,... ,K to be common change sets if: 

1. The sets Si are connected. 


2. For all total average changes, defined as 


A; 


,(Si,S p ) = lim 

a—too 


2 -^ k =i 


(1.5) 


with A k (Si, S p ) = m,k{Si) — m k (S p ), it holds, for all adjacent sets Si,S p with 
l p, that 

0 < ^lc(Si,S p ) < oo, (1.6) 

which quantifies the notion “common” in our setting. 

Since the extension to multiple change sets is straightforward, we will (mostly) 
restrict ourselves in Sections [2]4] below to the single change set case, i.e. to K = 2, 
where D = S + S c and S, S c are both formally common change sets. However, 
here it is more convenient to think that S c reflects the normal state region and S 
is the only common change set differing from that normal state. We will use this 
terminology for brevity and simply write = A^ 0 (5, S c ) in this situation. 
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Motivation 


The model (1.2)-(1.4) with change sets Si. S 2 


states a natural spatial extension 

of the setting considered e.g. by iTorgovitski (12015). 

One may think of digital imaging and assume a rectangular image sensor, i.e. an 
array of pixel sensors, corresponding to the domain D. Further, assume the image 
sensor to record a large test-sequence X \,..., Xd of images that represent the light 
intensity (e.g. as monochrome grayscale images) and assume that the measurements, 
i.e. the images, are affected by some random noise. 

Altogether, each image corresponds to an observation X k where X k (i. j) repre¬ 
sents the measured intensity by the (i,j)-th pixel, where is the true image 

intensity (cf., e.g., Qui (2007)). Now, one may think of change sets Si, £2 • • • to 


correspond e.g. to objects in the image that should be segmented or to a set of faulty 
pixels. The estimates discussed in this article can be used to estimate such sets based 
on a sufficiently long sequence of observations, i.e. when d is large. 

In contrast to more established settings and approaches (cf., e.g., 


Polzehl and 


Spokoiny (2000) and Qui (2007)) our aim here is to identify the partitioning based 
on a whole sequence of images. It is important that our model allows the means rn k 
to change in each observation k at any point For instance, we may think of 

changing lighting conditions while the X k , k = 1, ...,d are recorded. (Otherwise, 
as shown in Figure [5j assuming the means nifc(i,j) to be constant across all k, 
one could e.g. simply rely on averages X k (i,j) for each point (i,j) to obtain a 
partitioning.) 

This article is organized as follows. We begin with preliminaries in Secti on [2| where 
we briefly recall some theoretical results of Bleakley and Vert (2011a) and Torgovitski 
( 2015[ ) on which our algorithms will be based. In Section |3j we describe the estimation 
algorithm together with the conditions that ensure consistency of the estimates. In 
Section [4] we show some simulation results to demonstrate the performance for finite 
d and especially to show that even moderate cf s yield reasonable results. 


2 Preliminaries 


Assume a “single” common change set scenario, i 
domain D = S + S c with common change sets 
horizontal sub-slices 


e., observations X k in (1.2) on a 
S , S c . Further, consider (i, r)-th 


Y (i,r) - f Y (i,r) Y {i,r) ] T 

1 j ~ l-Q.i > • • 1 ’ I j,d \ > 




(2.1) 


with ■ ■ ■ > \ obtained from these observations X k by setting 


Y$ r) :=X k (i,r + j- 1) 


for each k = 1 ,... , d, j = 1 ,.. ., N and some r £ {1,..., n— V+l} with 4 < N < n. 
Accordingly, we set vfk'* := £ k{i,T + j — 1 ) for the corresponding innovations. The 
series Yi,...,Yjv can be interpreted as panel data with d panels and finite time 
horizon N. Notice, that the time parameter of the original spatial observations has 
now become a dimension parameter. 

Assume that the particular (i, r)-th sub-slice {Y^ M '*}j=i,...,Ar intersects a change 
set S such that 


{i,r+j- 1) G 


S c (S) 
S ( S c ) 


for j = 1 ,...,u{i,r), 
for j = u(i, r) + 1,..., N, 


( 2 . 2 ) 
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holds true for some 1 < u(i, r) < N and where r £ {1,..., n — N+ 1}. This situation 
is illustrated for the (9, 3)-rd sub-slice of size N = 6 and with u( 9,3) = 2 in Figure 
[l] below. (We set formally u(i,r) = oo whenever (2.2) does not hold.) 


horizontal sub-slice 
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Figure 1: The scheme represents the domain 
change set S. 


D and sub-slices which intersect the 


Altogether, we have a classical change in the mean scenario for any {Y^ 
where u = u(i, r ) corresponds to a single change point, i.e. ( 2 . 1 ), fits into the frame¬ 


work considered in Torgovitski (2015). In order to estimate u one may use a weighted 
CUSUM estimate 


u = u(i,r)= argrnax w(p,N) A 


. DBhr'-ftFii 2 . 

\ k=l j=1 


(2.3) 


Here, we restrict our considerations to a typical class of weighting functions, i.e. 

w(p,N) = ((p/N)(l-p/N))~'y 

parametrized by some 7 S [0,1/2) which controls the sensitivity. The argrnax in 
(2.3) is defined, as usual, as the smallest index at which the maximum is attained and 
Y Ntk = ZZi Y i,k/N. 


To define classes of reasonable estimates for our original model (1.2) we would like 
to make use of the sub-slices (2.1) and of the estimates (2.3), together with the cor¬ 
responding theoretical results of Bleakley and Vert (2011a) and Torgovitski (2015). 
Therefore, we define the normalized noise to change ratio parameter p w.r.t. the 
set S and w.r.t. to the length of sub-slices N as 


P(S) = 


1 CT 2 

NAL' 


(2.4) 


We also need the following assumption corresponding to Torgovitski (2015 Assump¬ 
tion (2.14)): 

d p 



“ VN,k)) = o(l), 


(2.5) 


fc=l 3=1 
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as d —> oo, for every p = 1,..., 
e.g. clearly holds true if the X\ 


N — 1. (2.51 is a weak dependence condition that 


are m-dependent (in particular independent) and 


identically distributed in k (cf. Torgovitski (2015)). Notice that we write rjj^ instead 
Vj 1 ^ because (2.5) does not depend on the parameters 


of 


i, r. 


Our starting point is the fact that the above estimates (2.3) consistently estimate 


under the assumptions on the model ( 1 . 2 ) and under the Assumptions ( 2 . 2 ) and 


(2.5). More precisely, we rely on the following “switching” behaviour of the estimates: 


1. If there is a single change point u(i,r) w.r.t. the (i,r)-th sub-slice (2.1), 
such that condition ( 2 . 2 ) is fulfilled, then u estimates the true change-point 


consistently for any 7 £ [ 0 , 1 / 2 ) if the noise to change ratio is below a positive 
threshold R("f,u,N). As d —> 00 , it holds that 


P(u = «)—>• 1 


( 2 . 6 ) 


for any 7 £ [0,1/2) given that p < 11 ( 7 , 11 , N) (cf., eg., Bleakley and Vert] 


(2011a, Theorem 2) and Torgovitski (2015 Theorems 2.6 and 2.13)). The opti¬ 
mal threshold R strongly depends on the parameter 7 . The particular values 
for R(0,u,N) may be obtained from Bleakley and Vert ( |2011a| Theorem 2) in 
a closed form. Furthermore, it holds that R(l/2,u, N) = 00 (cf., e.g., Bleakley 


(2015 


and Vert (2011a| Theorem 3) in the Gaussian i.i.d. case and take Torgovitski 


Theorem 2.6) into account regarding the nonparametric and dependent 
settings). We set 

1Z(j,N):= min R( 7 ,u,N ), 

u=l,...,N—l 

which again does not depend on parameters i, r. 

2. If there is no change point at all in the (i, r)-th sub-slice (2.1 ), i.e. if it holds that 
(i, r + j — 1) ^ S for j = l,...,N or that (*, r + j — 1) £ S for 
then u estimates a spurious change. It holds that, as d — > 00 , 


j = 1,...,N, 


P(u= [N/2\ Vu= \N/2]) 


1 


(2.7) 


for any 7 £ [0,1/2) (cf. Torgovitski (2015 Remark 2.7)). 


We do not have closed form expressions for R( 7 ) if 7 £ (0,1/2). However, from 


Torgovitski (2015 disp. (2.12)) it is clear that R( 7 ) tends to infinity as yfl/2 (cf. 
also further approximations to R(j) 


Torgovitski (2015 Proposition 2.15)). 


set estimates S for S in Section |3| Notice that the above switching property holds 
true for series <113 of any length N > 4, i.e. also for small single digit series of size 
N £ {4,... ,9}. In order to have a unique limit in (2.7) we will consider only even 
N. Finally, we would like to mention that any other estimate u with an analogous 
switching behaviour might be used for scanning and aggregation in the next section 
as well. 


The above switching behaviour in (2.6) and in (2.7) will provide consistent change 
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3 Estimation procedure 


We stick to the single common change set scenario of the previous section. The key 


to the estimation of change sets in model ( 1 . 2 ) will be a horizontal and/or a vertical 
overlapping scanning approach. The idea is to reduce the global problem of the change 
set estimation to many local single change point problems. This will allow us to lean 


on the results of Bleakley and Vert (2011a) and of Torgovitski (2015) which were 
summarized in Section |2j 

We propose a four step procedure which is outlined in the following. Notice that 
each step 1-4 may be performed horizontally or vertically even though some steps are 
described for the horizontal approach only. Moreover, we explicitly allow to combine 
the horizontal with the vertical approach by proceeding consecutively. (The vertical 
approach proceeds in the very same manner with the obvious modifications. Clearly, 
the notation of the previous Section [2] has to be adapted as well which will also be 
indicated below.) 


1. Slicing: 


• The time series {W,} is sliced into m non-spatial d-dimensional time 
series {Y^, k = 1 ,..., for i=l,...,m given by 


Yj$ :=x k{i,j), j = 1 , • • ■, n, & = d. 


(3.1) 


We will denote Y> 1 ^ as a the *-th horizontal slice in the following. Similarly, 


we may define a j -th vertical slice by := X^{i. j ) for any i = 1 ,,m, 

k = 1 ,..., d. 

• Now, we tacitly assume that N is even and subdivide each i-th horizontal 
slice into n — N overlapping sub-slices 


Y j i,r) = ■ ■ • • y 


(i,r) y (i ’ r) ] T 

j, 1 J ’ 


of size N , which are indicated by the parameter r, and where 


V 


*>r) ._ y(») 




(3.2) 


(3.3) 


j,k ' r+j — l,fe> 

for k = 1,..., d, r = 1, ..., n — N +1 and i = 1, ..., m. Similarly, we may 
subdivide the j-tli vertical slice into vertical sub-slices by 


- ryO» Y ij,r) 1 T i = 1 N 


where Y- 


(j,r 


i.k 


:= y 


U) 


r+i—l,fc’ 


i = 1,..., TV, for k = 1,..., d, r = 1,..., n—IV+1 


and j = 1,... ,n. Definition (3.2) resembles (2.1). However, in the 3rd step 


it will be convenient to think of all horizontal (vertical) sub-slices as parts 
of the same horizontal (vertical) slice, respectively. 


2. Scanning for critical points (Aggregation): 


Any sub-slice (3.2), or the vertical counterpart, is now treated as an individ¬ 
ual time series to which we apply a single change-point estimate (2.3) with 
any 7 £ [0,1/2) as described in Section [ 2 ] (Also we tacitly assume the 
necessary modifications for vertical slices). Since we have n — N sub-slices 
for each i, we aggregate n — N estimated change point locations 

{fi(i, r) £ N, r = 1,..., n — N + 1}, 


again, for any i. These locations will be called critical points in our spatial 
context. 
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• The locations u(i,r) are integer-valued numbers since they are computed 
w.r.t. the (z,r)-th sub-slices. Hence, we need to map them back on our 
grid domain D via 


U(i, r) := ( i , u(i, r) + r — 1) £ D 

for r = 1,..., n—N+1, i = 1,..., m. For theoretical reasons, we will restrict 
the admissible change sets S by requiring ( i,j) £ S if j > n — N +1 or if 
i > m — N + 1. Also, for technical reasons, we have to set U(i,r) := (z, 0) 
for 7' = n — N + 2,..., n and i = 1,..., m. 


3. Selection of relevant critical points: 


In this step, our aim is to identify the boundary B of the change set S. 
It will induce an estimate S' in a straightforward manner. Observe that only 
those critical points U (z, r) which are adjacent to the change set S (or he in 
B) may help us to identify this boundary and therefore the set S. Asymptoti¬ 
cally, i.e. as d —> oo and with probability tending to 1, the points v £ B will 
correspond to those u(i,r) and U(i,r) that are based on correct estimation 
(2.6) and not on the spurious ones as in (2.7). Hence, we have to filter out the 
latter by selecting a set G of relevant critical points, based on U. that is 
expected to be informative, based on suitable decision rules. 


We present the overlapping ( N,Q) rules, in form of a pseudocode. Let H(i) 
denote the set of relevant points w.r.t. the z-th horizontal slice and recall that we 
assume N > 4 to be even. The overlapping (N, Q) rule is: 


1 : Choose some integer 1 < Q < N — 2 
2 : for i = 1 to m do 
3: fT(i)<-0 

4: for r = 1 to n — A^ T 1 do 

5: if U(i,r) = U(i,r + 1) = ... = U(i,r + Q) then 

6 : H{i) <r- H{i)U{U(i,r)} 

7: end if 

8: end for 

9: end for 


The idea behind this algorithm is according to (2.6) and (2.7) that, assuming that 


the noise to change ratio lies below the threshold only the following 

cases may occur for the estimate u applied to (3.2): 


(a) There is a single change at 1 < zt < AT — 1. In that case we know that 
asymptotically, as d —> oo, this point is estimated correctly as a critical 
point with probability tending to 1 . 

(b) There is more than one change point. This case will be excluded from our 
consideration (cf. conditions on the change sets (3.41 in Theorem |3.1| below). 

(c) There is no change in this sub-slice. Hence, asymptotically as d-> oo, we 
estimate u = \_N/2\ spuriously with probability tending to 1 . 


For simplicity assume that Q = 1 (the case Q > 1 works in the same way). 
If some consecutive sub-slices, e.g. the (z,r)-th and (z, r + l)-th, intersect the 
change set region S, such that both have a single change point, i.e at u(i,r) 
and at u(z,r + 1 ), then we are in case a) for both sub-slices and therefore 
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P(U(i,r + 1) = U(i,r)) —> 1 , as d —> oo which means that the condition of the 
5th line, in the above algorithm, is fulfilled for Q = 1. On the other hand, if 
there is no change in at least one of the two subslices, we have P(U(i,r + 1) = 
U(i,r )) —> 0, as d —» oo, and the 5th line is always violated for any Q . Hence, 
the sets H(i), V(j ) will asymptotically contain only points that correspond to 
change-points in the sub-slices. 

The parameters N and Q , allow us to control the sensitivity. In particular, a 
smaller Q is less restrictive and therefore more sensitive, but less reliable. The 
overlapping rule is sketched in Figures [2] and [3] below. The illustration is based 
on a fragment of rectangular spatial observations of size m = 14, n — 20. Each 
field corresponds to a point (i, j) in a straightforward manner. The sensitivity 
w.r.t. the parameters ( N,Q ) and 7 is demonstrated in Figures [ 6 ][ 8 j below. 

Subsequently, we write V(j) for the set of relevant change points w.r.t. the j- th 
vertical slice. In case that we perform horizontal scanning only, we set formally 
V ( j ) = 0 for j = 1,..., n and analogously we set H{i) = 0 for i = 1,..., to 
if we would perform vertical scanning only. 

The pooled set of all relevant critical points will be denoted by 
G = H{ 1) U ... U H(m) U V(l) U ... U V(n). 


4. Connecting relevant critical points: 


The set G should asymptotically contain only nodes adjacent to the bound¬ 
ary B of S. Hence, based on G, or on H(i) and V(j), we try to iden¬ 
tify as many nodes of S as possible. Here, this is carried out for each slice 

separately^ _ (We tacitly restrict the class of change-sets S according to the 

Theorem 3.1). Let S denote the estimate for S. The horizontal procedure is: 


1: 5<-0 

2 : for i = 1 to m do 

3: if H(i) = {(*, xi), (i, X 2 ), ••-,(*, £p)}, X\ < X 2 < ■ ■ ■ < x p with p> 2 

then 

4: S:=SU{(i,a;i + l),...,(!,a: p )} 

5: end if 

6: end for 

and the analogous vertical procedure is: 

1: for j = 1 to n do 

2 : if V(j) = {(xi,j),(x 2 ,j),-.-,(x p ,j)}, X! < x 2 < ■■■ < x p with p > 2 

then 

3: 5:=S , U{(xi + l,j),...,(a:p,j)} 

4: end if 

5: end for 
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__ ( 14 , 14 ) 



Figure 2: The common change set S is indicated by green color. The darker fields 
indicate the boundary of S. The yellow fields indicate relevant critical 
points that will be eventually selected, i.e. with probability tending to 1 as 
d —> oo, applying the overlapping (4, 2) rule with any 7 £ [0,1/2). 



1 2 3 4 5 6 7 8 9 10 11 12 13 14 

u n 


(a) scanning result for 
N — 6 with any 
7 < 1 / 2 ; step 2 



1 2 3 4 5 6 7 8 9 10 11 12 13 14 

u n 

Is chosen as a critical point 

I * | Is chosen as a relevant critical point 

II I Is considered as a spurious estimate 

(b) overlapping (6,4) 
rule with any 
7 < 1/2; step 3 


Figure 3: The figures illustrate the algorithm for the overlapping rule. All figures 
show the 10-th horizontal slice of Figure [2] and the corresponding n — N 
overlapping sub-slices. It is indicated which points are selected as relevant 
critical points asymptotically with probability tending to 1 as d —1 00 ) 


Theorem 3.1. Assume a rectangular domain D with min{m, n} > f > 4, a 
boundary B and a common connected change set S with d(S,B) > £ — 1 for some 
£ £ N. Define vertical and horizontal intersections by 


(3.4) 


Hi := S n{(i,l)\ 1 = 1 , 

Vj ■■= SC{(l,j )| 1 = 1,..., to }, 

for all j = 1,..., n, i = 1,..., m, respectively. Furthermore, Assume that all Hi and 
Vj are either empty or connected sets. We have to state three different assumptions: 

1. Assume that we use the horizontal approach and that for all i = 1,... ,m it 
holds that \Hi\ > £ if Hi 0 . 

2. Assume that we use the vertical approach and that for all j = 1,... ,n it holds 
that \Vj\ > £ if Vj ^ 0. 

3. Assume that we combine the horizontal together with the vertical approach and 
that for all i = 1 ,...,m, j = 1,... ,n it holds that max{|i2i|, | Vj |} > f if 

(i,j) £ Hi n Vj. 

Assume that we use the overlapping (N,Q) rule, with 4 < N < £ and 1 < Q < N — 2 
where N is even. Furthermore, let 7 £ [0,1/2) and the ratio p be below 7 ?.( 7 ,£). 
Under either of the above Assumptions 1-3, given that (2.5) holds true w.r.t. all 
sub-slices ( 2 . 1 ), it holds that, as d —» 00 , 

P(S = S)-> 1. (3.5) 
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Proof. Conditions on Hi and Vj ensure that only two cases may occur. Either a 
sub-slice contains a single change point or does not contain a change point at all. The 
number of sub-slices is fixed and finite. Hence, the overall consistency of S follows 
from the consistency of all estimates u(i, r ) in case of a change and from the fact 
of spurious estimation when there is no change (cf. Torgovitski (2015, Theorems 2.6, 
2.13 and Remark 2.17)). □ 


4 Simulations 


We start this section by illustrating the estimation procedure and the corresponding 
Theorem |3.1| of the previous Section [3] 

We begin with parameters that will be common in our simulations. For simplicity, 
we consider a domain D with m,n = 100 and assume the noise to be i.i.d. normally 
distributed with e 1 (l,l) = IV(0, er 2 ). We consider a single common change set scenario 
(see Section [l] and [2]) and define test change sets S with radius w, centered at a 
point v £ D, by 

S w> y := {it e D | ||it - i>||p < w}. 

Here, \\u — n|| p is the usual p-Norm for vectors in R 2 and p = 00 denotes the 
maximum norm. We call such sets rectangular-shaped for p = 00, round-shaped for 
p = 2 and diamond-shaped for p = 1. The reference mean level nik(S c ) is set to 
ink = k for k = 1,... ,d. 


Remark 4.1. Recall, that dj denotes the Jaccard distance defined in Clearly, 

relation (3.5) implies P(dj(S,S) = 0) —>• 1 as d — > 00 which in turn yields 
E(dj(S,S)) — > 0 as d — > 00 . The latter follows e.g. due to uniform integrability of 
dj(S,S) £ [0,1]. In our simulations we demonstrate the influence of various param¬ 
eters on the expected Jaccard distance Edj = E(dj(S,S)) which is approximated 
based on 100 repetitions. 


Table [l] shows Edj for the overlapping ( N,Q ) rules w.r.t. different parameters 
d, N , Q and 7 . Generally, it is not clear which combination of sensitivity parameters 
( N , Q) and 7 is preferable. Hence, our advise is to plot different estimates and to 
rely on visual inspection (cf. Figures |6]-|8] below). Nevertheless, we see two tendencies 
where either the expected distance Edj improves for larger d, e.g. for N = 4, Q = 1 
and 7 = 0.3, or worsens, e.g. for TV = 4, Q = 1 and 7 = 0 . In accordance with the 
theory, the former happens if the ratio p is below the threshold TZ and the latter 
when p is above. Notice, that the precision does not monotonously increase in 7. 

The Figures i-i are based on a spatio-temporal sequence which is illustrated 
in Figure [5] Comparing the Figures [ 6 ] and [7] for 7 = 1/4 we see that a larger d 
improves the estimation. Clearly, a smaller Q yields more sensitive estimates but on 
the other hand larger parameters Q may isolate the change sets better. Table [T] and 
Figures [ 6 ][S] show that the usage of the horizontal procedure together with the vertical 
procedure might be better or worse than the plain horizontal approach. However, for 
some change sets, e.g. diamond-shaped, it is necessary to use both directions in order 
to obtain a consistent estimate (cf. Figure [4]). 
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7 = 0 


7 = 0.1 


7 = 0.2 


7 = 0.3 


7 = 0.4 


overlapping 

(4,1) 











d = 

100 


0.46 

(0.51) 

0.47 

(0.54) 

0.49 

(0.54) 

0.50 

(0.55) 

0.51 

(0.55) 

d = 

200 


0.68 

(0.52) 

0.44 

(0.46) 

0.45 

(0.53) 

0.49 

(0.55) 

0.51 

(0.55) 

d = 

300 


0.89 

(0.79) 

0.52 

(0.38) 

0.41 

(0.49) 

0.48 

(0.54) 

0.51 

(0.55) 

d = 

500 


0.97 

(0.94) 

0.72 

(0.52) 

0.30 

(0.28) 

0.43 

(0.53) 

0.50 

(0.55) 

d = 

1000 


0.99 

(0.99) 

0.82 

( 0 . 66 ) 

0.24 

(0.06) 

0.23 

(0.35) 

0.49 

(0.54) 

overlapping 

(4,2) 











d = 

100 


0.92 

(0.85) 

0.74 

(0.59) 

0.53 

(0.47) 

0.44 

(0.50) 

0.45 

(0.53) 

d = 

200 


0.99 

(0.98) 

0.92 

(0.85) 

0.66 

(0.48) 

0.42 

(0.43) 

0.42 

(0.52) 

d = 

300 


1.00 

(0.99) 

0.95 

(0.91) 

0.73 

(0.54) 

0.38 

(0.33) 

0.41 

(0.50) 

d = 

500 


1.00 

( 1 . 00 ) 

0.97 

(0.94) 

0.75 

(0.56) 

0.28 

(0.16) 

0.36 

(0.47) 

d = 

1000 


1.00 

( 1 . 00 ) 

0.98 

(0.97) 

0.67 

(0.45) 

0.12 

( 0 . 02 ) 

0.23 

(0.35) 

overlapping 

(6,2) 











d = 

100 


0.42 

(0.48) 

0.42 

(0.51) 

0.43 

(0.52) 

0.45 

(0.53) 

0.46 

(0.54) 

d = 

200 


0.33 

(0.36) 

0.36 

(0.45) 

0.40 

(0.50) 

0.43 

(0.53) 

0.46 

(0.54) 

d = 

300 


0.24 

( 0 . 20 ) 

0.27 

(0.36) 

0.36 

(0.48) 

0.41 

(0.52) 

0.45 

(0.54) 

d = 

500 


0.10 

(0.04) 

0.11 

(0.16) 

0.26 

(0.38) 

0.38 

(0.50) 

0.44 

(0.53) 

d = 

1000 


0.01 

( 0 . 00 ) 

0.01 

( 0 . 01 ) 

0.07 

( 0 . 12 ) 

0.28 

(0.41) 

0.42 

(0.52) 

overlapping 

(6,4) 











d = 

100 


1.00 

( 1 . 00 ) 

1.00 

( 1 . 00 ) 

1.00 

( 1 . 00 ) 

0.95 

(0.89) 

0.65 

(0.46) 

d = 

200 


1.00 

( 1 . 00 ) 

1.00 

( 1 . 00 ) 

1.00 

( 1 . 00 ) 

0.95 

(0.90) 

0.53 

(0.30) 

d = 

300 


1.00 

( 1 . 00 ) 

1.00 

( 1 . 00 ) 

1.00 

( 1 . 00 ) 

0.94 

(0.89) 

0.40 

(0.18) 

d = 

500 


1.00 

( 1 . 00 ) 

1.00 

( 1 . 00 ) 

1.00 

( 1 . 00 ) 

0.93 

( 0 . 86 ) 

0.23 

(0.06) 

d = 

1000 


1.00 

( 1 . 00 ) 

1.00 

( 1 . 00 ) 

1.00 

( 1 . 00 ) 

0.90 

(0.82) 

0.05 

( 0 . 00 ) 

Table 1: The 

: expected Jaccard distance Edj 

for the overlapping algorithm for the 


horizontal (horizontal + vertical) approach. For simplicity we set mk(S) = 
k + (—l) fe , k = 1,..., d, i.e. m,k(S) = rrik(S c ) + (—l) fc , and cr 2 = 2. The 
change set S W)V is rectangular-shaped centered at v = (50,50) with 
w = 100/3 



(a) horizontal ap- (b) horizon- 

proach tal+vertical 

Figure 4: The figure shows the overlapping (16, 8 ) estimate with 7 = 0 for multiple 
diamond- and round-shaped change sets D = Sr> + Sr + S c with w = 
100/6. The horizontal procedure is consistent for the round-shaped set 
but only the horizontal + vertical approach yields a consistent estimate 
for the diamond-shaped change set. It holds m^So) = m.k(S c ) + (—l) fc , 
mk(Sn) = mk(S c ), k = 1 ,..., d and d = 1000 with er 2 = 1 
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Figure 5: The first two rows show the observations A'i ,...,Xg with a 2 = 2 for 
rrik(S) = k + (—l) fc and nik(S c ) = k. The third row shows the averages 
X 50 o- In a) for the simple case of nik(S) = 0 and mk(S c ) = 1. In b) 
for the case of rrik{S) = 0 and rrik(S c ) = (—l) fe and finally in c) for 
TOfc(5) = k and nik(S c ) = k + (— l) fc . Notice that by averaging we loose 
(or at least do not gain) information in the last two settings 
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h+v, (4,1), 

7 = 0.49, rel. 


h+v, (4,2), 

7 = 0.49, rel. 


h+v, (4,1), 

7 = 0.49, est. 


h+v, (4,2), 

7 = 0.49, est. 


Figure 6: The figures show the relevant critical points (rel.) and the corresponding 
estimates (est.) selected by the horizontal (h) or the horizontal+vertical 
(h+v) approach using the overlapping (N, Q ) rule and based on a diamond¬ 
shaped change set. It is mk(S) = k + (—l) fc , k = 1,... ,d and d = 500 
with (T 2 = 2 
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h+v, (4,1), 

7 = 0.49, rel. 


h+v, (4,2), 

7 = 0.49, rel. 


h+v, (4,1), 

7 = 0.49, est. 


h+v, (4,2), 

7 = 0.49, est. 


Figure 7: The figures show the relevant critical points (rel.) and the corresponding 
estimates (est.) selected by the horizontal (h) or the horizontal+vertical 
(h+v) approach using the overlapping (IV, Q) rule and based on a diamond¬ 
shaped change set. It is m.k(S) = k + (—l) fc , k = 1,..., d and d = 1000 
with er 2 = 2 


15 






















































References 


Arnold M, Wied D (2012) Testing for structural change in spatial regions at unknown 
positions. Discussion Paper 19/2012, SFB 823. 

Arnold M, Raabe N, Wied D (2014) Identifying different areas of inhomogeneous min¬ 
eral subsoil: spatial fluctuation approaches. Communications in Statistics - Simu¬ 
lation and Computation. doi:10.1080/03610918.2013.861487 

Bai J (2010) Common breaks in means and variances for panel data. Journal of Econo¬ 
metrics 157:78-92 

Bleakley K, Vert J P (2010) Fast detection of multiple change-points shared by many 
signals using group LARS. Neural Inform Process Syst 22:2343-2352 

Bleakley K, Vert J P (2011a) The group fused LASSO for multiple change-point 
detection. arXiv:1106.4199vl 

Bleakley K, Vert J P (2011b) The group fused LASSO for multiple change-point 
detection. Technical report HAL-00602121 

Hadri K, Larsson R, Rao Y (2012) Testing for stationarity with break in panels where 
the time dimension is finite. Bull Econ Res, Issue Supplement si. 64:sl23-sl48 

Kim D (2014) Common breaks in time trends for large panel data with a factor 
structure. Econom J 17:301-337 

Polzehl J, Spokoiny V (2000) Adaptive weights smoothing with applications to image 
restoration. J R Statist Soc B 62:335-354 

Qiu P (2007) Jump surface estimation, edge detection, and image restoration. Journal 
of the American Statistical Association 102(478):745-756 

Torgovitski L (2015) Panel data segmentation under finite time horizon. 
arXiv:1501.00177v2 


17 



